Skip to content

Multi split diff#4281

Merged
sougou merged 6 commits intovitessio:masterfrom
systay:multi-split-diff
Dec 18, 2018
Merged

Multi split diff#4281
sougou merged 6 commits intovitessio:masterfrom
systay:multi-split-diff

Conversation

@systay
Copy link
Copy Markdown
Collaborator

@systay systay commented Oct 17, 2018

This replaces #3781

A new implementation of split diff that diffs against all the
destination servers at the same time. Since we only have to read out of
the source a single time this could give significant time savings.

@tirsen tirsen mentioned this pull request Oct 17, 2018
@systay systay force-pushed the multi-split-diff branch 2 times, most recently from 06b0927 to b3a2db4 Compare November 1, 2018 15:31
@systay systay requested a review from sougou as a code owner November 22, 2018 10:15
tirsen and others added 2 commits November 29, 2018 09:50
A new implementation of split diff that diffs against all the
destination servers at the same time. Since we only have to read out of
the source a single time this could give significant time savings.

Signed-off-by: Jon Tirsen <jontirsen@squareup.com>
* Extracted a bunch of methods
* Instead of mutexes, use channels
* Check UIDs in separate channel
* Test if MSD can work with the given tables
* Remove destination tablet type that was not used anyway
* Make tests run MultiSplitDiff as well
* Make the unit test run MSD

Signed-off-by: Andres Taylor <antaylor@squareup.com>
@systay
Copy link
Copy Markdown
Collaborator Author

systay commented Nov 29, 2018

@deepthi @sougou This PR is ready to review.

Copy link
Copy Markdown
Contributor

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally got around to reviewing this!

I'm assuming most of the low level logic is good, as it was copied from SplitDiff. The big concern is related to the handling of merges. It will be preferable to support it here. This will allow us to eventually deprecate SplitDiff. The other option will be to detect it and error out saying that MultiSplitDiff does not support merges.

The amount of logic duplicated from SplitDiff is a concern, but acceptable.

keyspaceInfo *topo.KeyspaceInfo
shardInfo *topo.ShardInfo
sourceUID uint32
destinationShards []*topo.ShardInfo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to assume that you are only doing splits. But we may also do a merge, in which case there will be multiple source shards, and a single destination.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this not a split_clone responsibility and not a diff thing? or do you mean that when you are merging, you want the capability of also being able to diff before you switch over?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify on this weird SplitDiff behavior:

  • It does work for splits and merges.
  • It always wants you to specify the destination shard.
  • In case of merges, the destination shard will have multiple sources. If so, you must specify the source UID. Here's an example command: vtworker $TOPOLOGY_FLAGS -cell zone1 -alsologtostderr SplitDiff -min_healthy_rdonly_tablets=1 -source_uid=2 customer/0

func (msdw *MultiSplitDiffWorker) StatusAsHTML() template.HTML {
state := msdw.State()

result := "<b>Working on:</b> " + msdw.keyspace + "/" + msdw.shard + "</br>\n"

This comment was marked as resolved.

return nil, fmt.Errorf("failed to get list of keyspaces: %v", err)
}

producers := sync.WaitGroup{}

This comment was marked as resolved.

Andres Taylor added 2 commits December 13, 2018 08:56
Signed-off-by: Andres Taylor <antaylor@squareup.com>
Signed-off-by: Andres Taylor <antaylor@squareup.com>
* upstream/master: (77 commits)
  Parents the parent pom of the standard sonatype oss pom for better defaults.
  Allow subclasses of GrpcClientFactory to modify channel builder
  local example: fix typos
  helm: add missing default
  Adds removed vschema
  Add scope to oauth token for gcs backup storage client
  Fix typo in comment
  Calls Unlock explicitly after potetiantly losing it
  shellcheck fixes
  codeclimate fix
  add mysql helper script to local example, remove calls to mysql from scripts, they will be executed by the user using lmysql.sh
  use auth=none for vtgate, move sql scripts to common location
  add vschema ddl syntax to add/remove tables from unsharded keyspaces
  update the show syntax to match the vschema ddl changes
  change the vschema ddl syntax to use `alter vschema ...`
  add back files required for test
  remove unused files, add sql scripts to show intermediate and final results of resharding
  add descriptive comments to scripts, add etcd option, set MYSQL_FLAVOR is it isn't already set
  remove unused scripts
  adapt helm example to run locally
  ...
keyspaceInfo *topo.KeyspaceInfo
shardInfo *topo.ShardInfo
sourceUID uint32
destinationShards []*topo.ShardInfo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify on this weird SplitDiff behavior:

  • It does work for splits and merges.
  • It always wants you to specify the destination shard.
  • In case of merges, the destination shard will have multiple sources. If so, you must specify the source UID. Here's an example command: vtworker $TOPOLOGY_FLAGS -cell zone1 -alsologtostderr SplitDiff -min_healthy_rdonly_tablets=1 -source_uid=2 customer/0

Signed-off-by: Andres Taylor <antaylor@squareup.com>
@sougou sougou merged commit bbff624 into vitessio:master Dec 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants